Guidelines for annotating the LUNA corpus with frame information
This document defines the annotation workflow for adding frame information to the LUNA corpus of conversational speech. In particular, it details both the corpus pre-processing steps and the annotation process itself, with guidance on choosing the frame and frame element labels. It also describes 20 new domain-specific and language-specific frames. To our knowledge, this is the first attempt to adapt the frame paradigm to dialogs and, at the same time, to define new frames and frame elements for the specific domain of software/hardware assistance. The technical report is structured as follows: Section 2 gives an overview of the FrameNet project, while Section 3 introduces the LUNA project and the annotation framework for the Italian dialogs. Section 4 details the annotation workflow, including the format preparation of the dialog files and the annotation strategy. Section 5 discusses the main issues in annotating frame information in dialogs and describes how the standard annotation procedure was changed to address them. Finally, Section 6 reports the 20 newly introduced frames.
An Open-Domain Dialog Act Taxonomy
This document defines the taxonomy of dialog acts needed to encode domain-independent dialog moves in the context of a task-oriented, open-domain dialog. The taxonomy is formulated to satisfy two complementary requirements: on the one hand, domain independence, i.e. the ability to cover the full range of possible interactions in any type of conversation (particularly conversation oriented to the performance of tasks); on the other hand, the ability to instantiate a concrete set of tasks as defined by a specific knowledge base (such as an ontology of domain concepts and actions) and within a particular language. The modeling of dialog acts takes inspiration from several well-known dialog annotation schemes, such as DAMSL (Core & Allen, 1997), TRAINS (Traum, 1996) and VERBMOBIL (Alexandersson et al., 1997).
Response Generation in Longitudinal Dialogues: Which Knowledge Representation Helps?
Longitudinal Dialogues (LD) are the most challenging type of conversation for
human-machine dialogue systems. LDs include the recollections of events,
personal thoughts, and emotions specific to each individual in a sparse
sequence of dialogue sessions. Dialogue systems designed for LDs should
uniquely interact with the users over multiple sessions and long periods of
time (e.g. weeks), and engage them in personal dialogues to elaborate on their
feelings, thoughts, and real-life events. In this paper, we study the task of
response generation in LDs. We evaluate whether general-purpose Pre-trained
Language Models (PLM) are appropriate for this purpose. We fine-tune two PLMs,
GePpeTto (GPT-2) and iT5, using a dataset of LDs. We experiment with different
representations of the personal knowledge extracted from LDs for grounded
response generation, including the graph representation of the mentioned events
and participants. We evaluate the performance of the models via automatic
metrics and the contribution of the knowledge via the Integrated Gradients
technique. We categorize the natural language generation errors via human
evaluations of contextualization, appropriateness and engagement of the user
- …